Biostatistics For Dummies (Monika Wahi John Pezzullo)

probabilities. After that, whenever a patient is newly diagnosed with cancer, you can take that person’s

age, stage, and grade, and generate an expected survival curve tailored for that particular patient. (The

patient may not want to see it, but at least it could be done.)

You’ll probably have to do these calculations outside of the software that you use for the

survival regression, but the calculations aren’t difficult and can be done in a Microsoft Excel

spreadsheet. The example in the following sections uses the small set of sample data that’s

preloaded into the online calculator for PH regression at

https://statpages.info/prophaz.html. This particular example has only one predictor, but

the basic idea extends to multiple predictors.
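If you'd rather script the calculation than build it in a spreadsheet, here is a minimal Python sketch of the usual PH approach: raise each baseline survival value to the power of the patient's hazard ratio. It assumes the baseline curve describes a patient whose predictors all sit at their average values. The function name `patient_survival` and its arguments are hypothetical names chosen for illustration, not part of any software's output.

```python
import math

def patient_survival(baseline_survival, coefficients, patient_values, mean_values):
    """Scale a baseline survival curve for one patient under a PH model.

    Assumption: the baseline curve is for a patient whose predictors
    equal the mean values, so the linear predictor is centered on them.
    """
    # Linear predictor relative to the average patient (works for one or many predictors)
    h = sum(b * (x - m)
            for b, x, m in zip(coefficients, patient_values, mean_values))
    hazard_ratio = math.exp(h)
    # Each survival fraction is raised to the power of the hazard ratio
    return [s ** hazard_ratio for s in baseline_survival]
```

With one predictor, each list holds a single number; with several predictors, you supply one coefficient, one patient value, and one mean per predictor.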

Obtaining the necessary output

Figure 23-6 shows the output from the built-in example (omitting the Iteration History and Overall

Model Fit sections). Pretend that this model represents survival, in years, as a function of age for

patients just diagnosed with some particular disease. In the output, the age variable is called Variable

FIGURE 23-6: Output of PH regression for generating prognostic curves.

Looking at Figure 23-6, first consider the table in the Baseline Survivor Function section, which has

two columns: time in years, and predicted survival expressed as a fraction. It also has four rows —

one for each time point at which one or more deaths were actually observed. The baseline survival

curve for the example data starts at 1.0 (100 percent survival) at time 0, as survival curves always do,

but this row isn’t shown in the output. The survival curve remains flat at 100 percent until year two,

when it suddenly drops down to 99.79 percent, where it stays until year seven, when it drops down to

98.20 percent, and so on.
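The staircase behavior described above can be sketched in Python as a simple lookup. This is only an illustration: it includes just the two drops quoted in the text (years two and seven), and the names `drop_times`, `survival_after_drop`, and `baseline_survival` are made up for this example.

```python
from bisect import bisect_right

# Only the first two drops quoted in the text are entered here; the
# remaining rows of the table would be appended in the same way.
drop_times = [2, 7]                    # years at which deaths were observed
survival_after_drop = [0.9979, 0.9820] # survival fraction from each drop onward

def baseline_survival(t):
    """Return the baseline survival fraction at time t (in years)."""
    # Count how many drops have happened at or before time t
    i = bisect_right(drop_times, t)
    # Before the first drop, survival is still 1.0 (100 percent)
    return 1.0 if i == 0 else survival_after_drop[i - 1]
```

For instance, `baseline_survival(5)` falls between the two drops, so it returns the value set at year two. Because the later rows aren't entered, times past the third observed death would need the rest of the table.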

In the Descriptive Stats section near the start of the output in Figure 23-6, the average age of the 11